Integrating the Talkamatic Dialogue Manager with Alexa

Authors

  • Staffan Larsson
  • Alexander Berman
  • Andreas Krona
  • Fredrik Kronlid
Abstract

This paper describes the integration of Amazon Alexa with the Talkamatic Dialogue Manager (TDM), and shows how flexible dialogue skills and rapid prototyping of dialogue apps can be brought to the Alexa platform.

1. Alexa

Amazon's Alexa (https://developer.amazon.com/alexa) is a spoken dialogue interface open to third-party developers who want to develop their own Alexa "skills". Alexa has received a lot of attention and has brought renewed interest to conversational interfaces. It has strong STT (Speech To Text), TTS (Text To Speech) and NLU (Natural Language Understanding) capabilities, but provides less support for dialogue management and natural language generation (NLG), essentially leaving these tasks to the skill developer. See Figure 1 for an overview of the Alexa architecture.

An Alexa skill definition is more or less domain-specific. It also includes generation of natural language output, which makes it language-specific. Leaving NLG to the skill developer works fairly well for simple tasks, but for domains demanding more complex conversational capabilities, development becomes more challenging. Localizing skills to new languages is a further challenge, especially if the target language is grammatically more complex than English.

2. TDM

TDM (Talkamatic Dialogue Manager, www.talkamatic.se) [1, 2] is a dialogue manager with built-in multimodality, multilinguality, and multi-domain support, and an SDK enabling rapid development of conversational interfaces with a high degree of naturalness and usability.

The basic principle behind TDM is separation of concerns: do not mix different kinds of knowledge. TDM keeps the following kinds of knowledge separated from each other:

  • Dialogue knowledge
  • Domain knowledge
  • General linguistic knowledge of a particular language
  • Domain-specific language
  • Integration to services and data

Dialogue knowledge is encoded in the TDM DME (Dialogue Move Engine). Domain knowledge is declared in the DDD (Dialogue Domain Description, see Section 5). General linguistic knowledge is described in the Resource Grammar Library. Domain-specific language is described in the DDD-specific grammar. Service and data integration is described by the Service Interface, which is part of the DDD.

The dialogue knowledge encoded in TDM enables it to handle a host of dialogue behaviours, including but not limited to:

  • Over- and other-answering (giving more or other information than requested)
  • Embedded subdialogues (multiple conversational threads)
  • Task recognition and clarification from incomplete user utterances
  • Grounding (verification) and correction

TDM also supports localisation of applications to new languages (provided that STT and TTS are available). The currently supported and tested languages are English, Mandarin Chinese, Dutch and French. Support for more languages will be added in the future.

3. The relation between Alexa and TDM

We see the combination of TDM and Alexa as a perfect match. The strengths of the Alexa dialogue platform include the nicely integrated functionality for STT, NLU, and TTS, along with the integration with the Echo hardware. The strengths of TDM are centered on the dialogue management component and the multilingual generation. The strengths of the two platforms are thus complementary and non-overlapping.

4. TDM Alexa integration

See Figure 2 for an overview of the Alexa-TDM integration. A wrapper around TDM receives intents (e.g. requests and questions) and slots (parameters) from Alexa and translates them to their TDM counterparts (request, ask and answer moves), which are passed to TDM. The TDM DME (Dialogue Move Engine) then handles dialogue management (updating the information state based on observed dialogue moves, and selecting the best next system move) and utterance generation (translating the system moves into text). The generated text is passed back to Alexa through the TDM wrapper.

5. Dialogue Domain Descriptions

A TDM application (corresponding roughly to an Alexa skill) is defined by a DDD, a Dialogue Domain Description. The DDD is a mostly declarative description of a particular dialogue subject. Firstly, it specifies what information (basically intentions and slots) is available in a dialogue context, and how this information is related (dialogue plans). Secondly, it specifies how users and the system speak about this information (grammar). Lastly, it specifies how the information in the dialogue is related to the real world (service interface).

Copyright © 2017 ISCA. INTERSPEECH 2017: Show & Tell Contribution, August 20–24, 2017, Stockholm, Sweden
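
As a rough illustration of the wrapper step described in Section 4, the following Python sketch turns an Alexa-style intent and its slots into request, ask and answer moves. It is only a sketch under stated assumptions: the `Move` class, the `INTENT_TO_MOVE` table, the `translate` function and all intent and slot names are hypothetical and are not taken from the Alexa or TDM APIs.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Move:
    """A simplified dialogue move (hypothetical; not the TDM data model)."""
    kind: str      # "request", "ask" or "answer"
    content: str   # what is requested, asked about, or answered


# Hypothetical table mapping intent names to (move kind, move content).
INTENT_TO_MOVE = {
    "CallIntent": ("request", "call"),
    "PhoneNumberIntent": ("ask", "phone_number_of_contact"),
}


def translate(intent_name: str, slots: Dict[str, Optional[str]]) -> List[Move]:
    """Translate one intent and its slots into a list of dialogue moves."""
    moves: List[Move] = []
    # A known intent becomes a request or ask move.
    if intent_name in INTENT_TO_MOVE:
        kind, content = INTENT_TO_MOVE[intent_name]
        moves.append(Move(kind, content))
    # Each filled slot becomes an answer move; empty slots are skipped.
    for slot_name, value in sorted(slots.items()):
        if value is not None:
            moves.append(Move("answer", f"{slot_name}={value}"))
    return moves


if __name__ == "__main__":
    for move in translate("CallIntent", {"contact": "John", "number_type": None}):
        print(move)
```

In this sketch an unfilled slot simply produces no answer move, so deciding what to ask next is left to the dialogue manager, in line with the division of labour described above.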

Publication date: 2017